Enhanced Transformer architecture for in-context learning of dynamical systems

Rufolo, Matteo, Piga, Dario, Maroni, Gabriele, Forgione, Marco

arXiv.org Artificial Intelligence

Recently introduced by some of the authors, the in-context identification paradigm aims at estimating, offline and based on synthetic data, a meta-model that describes the behavior of a whole class of systems. Once trained, this meta-model is fed with an observed input/output sequence (context) generated by a real system to predict its behavior in a zero-shot learning fashion. In this paper, we enhance the original meta-modeling framework through three key innovations: by formulating the learning task within a probabilistic framework; by managing non-contiguous context and query windows; and by adopting recurrent patching to effectively handle long context sequences. The efficacy of these modifications is demonstrated through a numerical example focusing on the Wiener-Hammerstein system class, highlighting the model's enhanced performance and scalability.
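The "recurrent patching" idea for long context sequences can be illustrated with a toy sketch: a long input/output sequence is cut into fixed-length patches that a recurrent encoder can consume one at a time. The function name and patch length below are hypothetical, not the authors' implementation.

```python
import numpy as np

def patch_sequence(context, patch_len):
    """Split a long context sequence into fixed-length patches.

    Hypothetical sketch of the 'recurrent patching' idea: patches can
    then be fed to a recurrent encoder one at a time, keeping memory
    cost bounded regardless of context length.
    """
    n = len(context)
    # pad the tail with zeros so the sequence divides evenly into patches
    pad = (-n) % patch_len
    padded = np.concatenate([context, np.zeros(pad)])
    return padded.reshape(-1, patch_len)

patches = patch_sequence(np.arange(10, dtype=float), patch_len=4)
print(patches.shape)  # (3, 4)
```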


Multi-Intent Detection in User Provided Annotations for Programming by Examples Systems

Kumar, Nischal Ashok, Gupta, Nitin, Guttula, Shanmukha, Patel, Hima

arXiv.org Artificial Intelligence

In integrating enterprise applications, data mapping remains a fundamental part of integration development, but it is time-consuming. An increasing number of applications lack naming standards, and nested field structures add further complexity for integration developers. Once the mapping is done, data transformation is the next challenge for users, since each application expects data to be in a certain format. While building an integration flow, developers must understand the formats of the source and target data fields and come up with a transformation program that converts data from the source to the target format. The problem of automatically generating a transformation program from a specification through the program synthesis paradigm has been studied since the early days of Artificial Intelligence (AI). Programming by Example (PBE) is one such technique, which aims to automatically infer a computer program that accomplishes a format or string conversion task from user-provided input and output samples. To learn the correct intent, a diverse set of samples from the user is required. However, the user may fail to provide a diverse set of samples, which can lead to multiple intents, or ambiguity, in the input and output samples; a PBE system may then generate a program for the wrong intent. In this paper, we propose a deep neural network based ambiguity prediction model, which analyzes the input-output strings and maps them to a set of properties responsible for multiple intents. Users can analyze these properties and accordingly provide new samples or modify existing ones, helping to build a better PBE system for mapping enterprise applications.
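The kind of ambiguity the abstract describes can be seen in a toy example: two distinct "programs" are consistent with a single input-output sample, and only a more diverse second sample separates them. The candidate programs below are illustrative, not the output of a real PBE system.

```python
# Two candidate "programs" that both explain the single example
# ("john.doe@acme.com" -> "john"); a hypothetical illustration of
# multiple intents in Programming by Example.
prog_a = lambda s: s.split(".")[0]                 # text before the first '.'
prog_b = lambda s: s.split("@")[0].split(".")[0]   # local part, then before '.'

sample = "john.doe@acme.com"
assert prog_a(sample) == prog_b(sample) == "john"  # one example: ambiguous

# A second, more diverse sample separates the two intents:
probe = "support@acme.com"
print(prog_a(probe))  # 'support@acme'
print(prog_b(probe))  # 'support'
```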


Multiple output samples for each input in a single-output Gaussian process

Wong, Jeremy H. M., Zhang, Huayun, Chen, Nancy F.

arXiv.org Artificial Intelligence

The standard Gaussian Process (GP) only considers a single output sample per input in the training set. Datasets for subjective tasks, such as spoken language assessment, may be annotated with output labels from multiple human raters per input. This paper proposes to generalise the GP to allow for these multiple output samples in the training set, and thus make use of available output uncertainty information. This differs from a multi-output GP, as all output samples are from the same task here. The output density function is formulated to be the joint likelihood of observing all output samples, and latent variables are not repeated to reduce computation cost. The test set predictions are inferred similarly to a standard GP, with a difference being in the optimised hyper-parameters. This is evaluated on speechocean762, showing that it allows the GP to compute a test set output distribution that is more similar to the collection of reference outputs from the multiple human raters.
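Under a Gaussian noise model, a standard identity makes the idea of not repeating latent variables concrete: m output samples attached to the same latent function value act like a single observation of their mean with noise variance divided by m. The sketch below applies this identity with an RBF kernel; it is an illustration under that assumption, not the paper's exact formulation.

```python
import numpy as np

# Sketch: observing m output samples y_1..y_m for the SAME latent f(x)
# under Gaussian noise gives a joint likelihood proportional to one
# observation of the per-input mean with noise variance sigma2 / m.
def gp_posterior_mean(X, Y_samples, x_star, sigma2=0.1, ls=1.0):
    """GP posterior mean at x_star with multiple output samples per input."""
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)
    y_bar = Y_samples.mean(axis=1)                   # per-input sample mean
    m = Y_samples.shape[1]                           # samples per input
    K = k(X, X) + (sigma2 / m) * np.eye(len(X))      # shrunken noise term
    return k(np.atleast_1d(x_star), X) @ np.linalg.solve(K, y_bar)

X = np.array([0.0, 1.0, 2.0])
Y = np.array([[0.9, 1.1], [1.9, 2.1], [3.0, 3.0]])  # two raters per input
print(gp_posterior_mean(X, Y, 1.0))
```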


Step-by-step guide on how to train GPT-2 on books using Google Colab

#artificialintelligence

We will use Google Drive to save our checkpoints (a checkpoint is our last saved trained model). Once the trained model is saved, we can load it whenever we want to generate both conditional and unconditional texts. Now that you have your Google Drive connected, let's create a checkpoints folder. Next, let's clone the GPT-2 repository that we will use. It is forked from nnsheperd's awesome repository (which is in turn forked from OpenAI's, with the awesome addition of train.py). I have added a conditional_model() method, which lets us pass multiple sentences at once and returns a dictionary with the relevant model output samples. It also lets us avoid using bash code.
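The checkpoints-folder step can be sketched in Python. The directory name below is arbitrary; the drive.mount call shown in the comment is the standard Colab API, kept out of the runnable part so the snippet works anywhere.

```python
import os

# Hypothetical sketch of the checkpoint-folder step. In Colab you would
# first mount Drive with:
#   from google.colab import drive; drive.mount('/content/drive')
# and point CHECKPOINT_DIR somewhere inside '/content/drive/MyDrive'.
# Here we use a local path so the snippet runs anywhere.
CHECKPOINT_DIR = os.path.join("checkpoints", "gpt2-run")

os.makedirs(CHECKPOINT_DIR, exist_ok=True)  # idempotent: safe to re-run
print(os.path.isdir(CHECKPOINT_DIR))  # True
```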


Involutive MCMC: a Unifying Framework

Neklyudov, Kirill, Welling, Max, Egorov, Evgenii, Vetrov, Dmitry

arXiv.org Machine Learning

Markov Chain Monte Carlo (MCMC) is a computational approach to fundamental problems such as inference, integration, optimization, and simulation. The field has developed a broad spectrum of algorithms, varying in the way they are motivated, the way they are applied, and how efficiently they sample. Despite all the differences, many of them share the same core principle, which we unify as the Involutive MCMC (iMCMC) framework. Building upon this, we describe a wide range of MCMC algorithms in terms of iMCMC, and formulate a number of "tricks" which one can use as design principles for developing new MCMC algorithms. Thus, iMCMC provides a unified view of many known MCMC algorithms, which facilitates the derivation of powerful extensions. We demonstrate the latter with two examples where we transform known reversible MCMC algorithms into their irreversible counterparts.

Table 1: List of algorithms that we describe by the Involutive MCMC framework. See their descriptions and formulations in terms of iMCMC in the corresponding appendices.

Name & Citation: Appendix
Metropolis-Hastings (Hastings, 1970): B.1
Mixture Proposal (Habib & Barber, 2018): B.2
Multiple-Try Metropolis (Liu et al., 2000): B.3
Sample-Adaptive MCMC (Zhu, 2019): B.4
Reversible-Jump MCMC (Green, 1995): B.5
Hybrid Monte Carlo (Duane et al., 1987): B.6
RMHMC (Girolami & Calderhead, 2011): B.7
NeuTra (Hoffman et al., 2019): B.8
A-NICE-MC (Song et al., 2017): B.9
L2HMC (Levy et al., 2017): B.10
Persistent HMC (Horowitz, 1991): B.11
Gibbs (Geman & Geman, 1984): B.12
Look Ahead (Sohl-Dickstein et al., 2014): B.13
NRJ (Gagnon & Doucet, 2019): B.14
Lifted MH (Turitsyn et al., 2011): B.15
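The core iMCMC recipe (sample an auxiliary variable, apply a deterministic involution, accept with a Metropolis ratio) can be sketched in a few lines. With the involution f(x, v) = (x + v, -v) and a standard normal auxiliary, it reduces to random-walk Metropolis-Hastings; this is a minimal sketch, not the paper's general formulation.

```python
import numpy as np

def imcmc_chain(log_p, x0, n_steps, rng):
    """Random-walk MH written as Involutive MCMC."""
    x, xs = x0, []
    for _ in range(n_steps):
        v = rng.normal()                 # auxiliary variable, v ~ N(0, 1)
        x_new, v_new = x + v, -v         # involution: applying it twice returns (x, v)
        # Metropolis ratio; |det J| = 1 and N(v) is symmetric, so the
        # auxiliary terms cancel and only the target ratio remains.
        if np.log(rng.uniform()) < log_p(x_new) - log_p(x):
            x = x_new
        xs.append(x)
    return np.array(xs)

rng = np.random.default_rng(0)
samples = imcmc_chain(lambda x: -0.5 * x**2, 0.0, 20000, rng)  # target N(0, 1)
print(samples.mean(), samples.var())  # roughly 0 and 1
```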


Learning Generative Models of Structured Signals from Their Superposition Using GANs with Application to Denoising and Demixing

Soltani, Mohammadreza, Jain, Swayambhoo, Sambasivan, Abhinav

arXiv.org Machine Learning

In general, the separation problem is inherently ill-posed; however, with enough structural assumptions on X and N, it has been established that separation is possible. Depending on the application, one might be interested in estimating only X (in this case, N is considered the corruption), which is referred to as denoising, or in recovering both X and N, which is referred to as demixing. Both demixing and denoising arise in a variety of important practical applications in the areas of signal/image processing, computer vision, machine learning, and statistics [Chen et al., 2001, Elad et al., 2005, Bobin et al., 2007, Candès et al., 2011]. Most existing techniques assume some prior knowledge of the structures of X and N in order to recover the desired component signal(s). Prior knowledge about the structure of X and N can only be obtained if one has access to the generative mechanism of the signals or to clean samples from the probability distribution defined over the sets X and N. In many practical settings, neither of these may be feasible. In this paper, we consider the problem of separating constituent signals from superposed observations when clean access to samples from the distribution is not available.
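The demixing formulation can be sketched with toy linear "generators" standing in for trained GANs, which makes the least-squares structure of the recovery explicit. All names and sizes below are illustrative, not the paper's method.

```python
import numpy as np

# Toy sketch of demixing: given a superposed observation
# y = G1(z1*) + G2(z2*), recover latent codes by gradient descent on
# ||y - G1(z1) - G2(z2)||^2. Linear maps stand in for trained GANs.
rng = np.random.default_rng(1)
G1, G2 = rng.normal(size=(20, 3)), rng.normal(size=(20, 3))
z1_true, z2_true = rng.normal(size=3), rng.normal(size=3)
y = G1 @ z1_true + G2 @ z2_true          # observed superposition

z1, z2 = np.zeros(3), np.zeros(3)
lr = 0.01
for _ in range(5000):
    r = y - G1 @ z1 - G2 @ z2            # residual of the superposition
    z1 += lr * G1.T @ r                  # gradient step on z1
    z2 += lr * G2.T @ r                  # gradient step on z2

print(np.linalg.norm(y - G1 @ z1 - G2 @ z2))  # near 0
```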


Learning with Weak Supervision from Physics and Data-Driven Constraints

Ren, Hongyu (Peking University) | Stewart, Russell (Stanford University) | Song, Jiaming (Stanford University) | Kuleshov, Volodymyr (Stanford University) | Ermon, Stefano (Stanford University)

AI Magazine

In many applications of machine learning, labeled data is scarce and obtaining additional labels is expensive. We introduce a new approach to supervising learning algorithms without labels by enforcing a small number of domain-specific constraints over the algorithms’ outputs. The constraints can be provided explicitly based on prior knowledge — e.g. we may require that objects detected in videos satisfy the laws of physics — or implicitly extracted from data using a novel framework inspired by adversarial training. We demonstrate the effectiveness of constraint-based learning on a variety of tasks — including tracking, object detection, and human pose estimation — and we find that algorithms supervised with constraints achieve high accuracies with only a small amount of labels, or with no labels at all in some cases.
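A physics constraint of the kind described (a tracked falling object should follow a parabola in time) can serve as a label-free training signal by penalizing deviation from the constraint. The sketch below is illustrative, not the paper's implementation: it scores a predicted height trajectory by its residual from the best-fit parabola, with no height labels needed.

```python
import numpy as np

def freefall_constraint_loss(t, h_pred):
    """Squared residual of h_pred from the best-fit parabola in t."""
    A = np.vander(t, 3)                            # columns: t^2, t, 1
    coef, *_ = np.linalg.lstsq(A, h_pred, rcond=None)
    return float(np.sum((A @ coef - h_pred) ** 2))

t = np.linspace(0, 1, 20)
good = 10 - 4.9 * t**2                             # consistent with gravity
bad = 10 - 4.9 * t**3                              # violates the constraint
print(freefall_constraint_loss(t, good))           # ~0
print(freefall_constraint_loss(t, bad) > 1e-3)     # True
```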


Raw Waveform-based Speech Enhancement by Fully Convolutional Networks

Fu, Szu-Wei, Tsao, Yu, Lu, Xugang, Kawai, Hisashi

arXiv.org Machine Learning

This study proposes a fully convolutional network (FCN) model for raw waveform-based speech enhancement. The proposed system performs speech enhancement in an end-to-end (i.e., waveform-in and waveform-out) manner, which differs from most existing denoising methods that process the magnitude spectrum (e.g., log power spectrum (LPS)) only. Because the fully connected layers, which are involved in deep neural networks (DNN) and convolutional neural networks (CNN), may not accurately characterize the local information of speech signals, particularly with high frequency components, we employed fully convolutional layers to model the waveform. More specifically, FCN consists of only convolutional layers and thus the local temporal structures of speech signals can be efficiently and effectively preserved with relatively few weights. Experimental results show that DNN- and CNN-based models have limited capability to restore high frequency components of waveforms, thus leading to decreased intelligibility of enhanced speech. By contrast, the proposed FCN model can not only effectively recover the waveforms but also outperform the LPS-based DNN baseline in terms of short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ). In addition, the number of model parameters in FCN is approximately only 0.2% compared with that in both DNN and CNN.
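The length-preserving property that makes the model "waveform-in, waveform-out" can be sketched with plain 1-D convolutions: a stack of convolutional layers maps a waveform of any length to an output of the same length, with a weight count independent of the input length. The filters below are arbitrary, not trained.

```python
import numpy as np

def fcn_1d(x, kernels):
    """Toy fully convolutional 1-D model: conv + ReLU per layer."""
    for k in kernels:
        x = np.convolve(x, k, mode="same")   # length-preserving convolution
        x = np.maximum(x, 0.0)               # ReLU nonlinearity
    return x

kernels = [np.array([0.25, 0.5, 0.25])] * 3  # 3 layers, only 9 weights total
for n in (160, 1600):                        # waveforms of different lengths
    y = fcn_1d(np.random.default_rng(0).normal(size=n), kernels)
    print(len(y) == n)                       # True: waveform-in, waveform-out
```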